Evaluation of Clustering around Weighted Prototype and Genetic Algorithm for Document Categorization
Abstract
Document clustering is an important task in text categorization. The genetic algorithm is an optimization-based technique that can be applied to find the best cluster centres by computing fitness values of data points. Clustering around weighted prototypes, by contrast, is especially helpful when proper pairwise similarities are available. This technique, however, does not find the global solution of the objective function. Experimental results show that the F-measure and normalized mutual information of the genetic algorithm are better than those of clustering around weighted prototypes on the 20 Newsgroups dataset. The F-measure and accuracy of the genetic algorithm are also better than those of clustering around weighted prototypes on the Reuters-21578 dataset.
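To make the idea of searching for cluster centres by fitness concrete, here is a minimal sketch of a generic genetic algorithm for clustering. It is not the method evaluated in the paper, whose encoding, fitness function, and operators are not given in the abstract. The fitness (lower sum of squared distances to the nearest centre is fitter), the elitist selection, the uniform crossover, the Gaussian mutation, and the names `sse` and `ga_cluster_centres` are all assumptions chosen for illustration. Documents are assumed to be already represented as numeric vectors (for example TF-IDF); small 2-D points stand in for them here.

```python
import random


def sse(points, centres):
    """Sum of squared distances from each point to its nearest centre."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres)
    return total


def ga_cluster_centres(points, k, pop_size=20, generations=50,
                       mutation_scale=0.1, seed=0):
    """Search for k cluster centres with a simple genetic algorithm (illustrative)."""
    rng = random.Random(seed)
    # Each chromosome is a list of k candidate centres, initialised from data points.
    population = [[list(rng.choice(points)) for _ in range(k)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Lower SSE means higher fitness, so sort ascending by SSE.
        population.sort(key=lambda chrom: sse(points, chrom))
        survivors = population[: pop_size // 2]  # elitist selection keeps the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            # Uniform crossover: each centre is copied from one of the two parents.
            child = [list(rng.choice(pair)) for pair in zip(mum, dad)]
            # Gaussian mutation nudges every coordinate slightly.
            child = [[x + rng.gauss(0.0, mutation_scale) for x in c] for c in child]
            children.append(child)
        population = survivors + children
    return min(population, key=lambda chrom: sse(points, chrom))


# Toy data: two well-separated groups of points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best_centres = ga_cluster_centres(toy_points, k=2)
print(best_centres, sse(toy_points, best_centres))
```

Because the best half of each generation is carried over unchanged, the best fitness never gets worse from one generation to the next. As with any genetic algorithm, this is a heuristic search and does not guarantee the global optimum.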
Similar resources
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
Optimization of fuzzy controller for an SMA-actuated artificial finger robot
The purpose of this paper is to design and optimize an intelligent fuzzy-logic controller for a three-degree of freedom (3DOF) artificial finger with shape-memory alloy (SMA) wire actuators. The robotic finger is constructed using three SMA wires as tendons to bend each phalanx of the finger around its revolute joint and three torsion springs which return the phalanxes to their original positio...
Bilateral Weighted Fuzzy C-Means Clustering
The Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
Cluster Based Hybrid Niche Mimetic and Genetic Algorithm for Text Document Categorization
An efficient cluster based hybrid niche mimetic and genetic algorithm for text document categorization to improve the retrieval rate of relevant document fetching is addressed. The proposal minimizes the processing of structuring the document with better feature selection using hybrid algorithm. In addition restructuring of feature words to associated documents gets reduced, in turn increases d...
A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset
Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...